Conversation
There was a problem hiding this comment.
Pull request overview
This PR implements a new dartboard summarize CLI subcommand that gathers metrics, profiles, and resource counts from deployed clusters. The implementation includes three standalone Go tools (collect-profile, export-metrics, and resource-counts) that are built and executed by the summarize command. The PR also adds functionality to update the monitoring project annotation during deployment.
Changes:
- Added
dartboard summarizesubcommand to collect diagnostics from clusters - Implemented three new diagnostic tools in Go with corresponding build scripts
- Updated deployment process to annotate cattle-monitoring-system namespace with project ID
- Added unused GetProject function to kubectl package
Reviewed changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 24 comments.
Show a summary per file
| File | Description |
|---|---|
| cmd/dartboard/main.go | Registers the new summarize subcommand |
| cmd/dartboard/subcommands/summarize.go | Implements summarize logic: builds and runs diagnostic tools |
| cmd/dartboard/subcommands/deploy.go | Adds updateMonitoringProject function to annotate monitoring namespace |
| internal/kubectl/kubectl.go | Adds GetProject helper function (unused) |
| summarize-tools/collect-profile/* | Go tool and build script for collecting heap/CPU profiles |
| summarize-tools/export-metrics/* | Go tool, build script, and pod manifest for exporting Prometheus metrics |
| summarize-tools/resource-counts/* | Go tool, build script, and legacy bash script for counting Kubernetes resources |
| .gitignore | Adds patterns to ignore generated binaries and result directories |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| containers: | ||
| - name: mimirtool | ||
| image: grafana/mimirtool:2.13.0 | ||
| command: ["/bin/sh", "-c"] | ||
| args: | ||
| - | | ||
| echo "Mimirtool pod is running. Use 'kubectl exec' to run commands." | ||
| # Keep the container running | ||
| while true; do | ||
| sleep 30 | ||
| done No newline at end of file |
There was a problem hiding this comment.
The mimirtool pod runs an infinite loop with a 30-second sleep, which will consume resources indefinitely. Consider adding resource limits (requests/limits for CPU and memory) to this pod specification to prevent it from consuming excessive cluster resources.
| // Complex bash command inside exec needs careful wrapping | ||
| metricsCmd := `curl -s -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" -k https://127.0.0.1/metrics` |
There was a problem hiding this comment.
The metrics command uses 'curl -k' (insecure, skipping TLS verification) to fetch metrics from https://127.0.0.1/metrics. While this is localhost and might be acceptable in a testing/development context, consider documenting why TLS verification is skipped or using a more secure alternative if possible.
| // Complex bash command inside exec needs careful wrapping | |
| metricsCmd := `curl -s -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" -k https://127.0.0.1/metrics` | |
| // Use in-cluster CA bundle for TLS verification when fetching metrics | |
| metricsCmd := `curl -s --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://127.0.0.1/metrics` |
git-ival
left a comment
There was a problem hiding this comment.
Hey @MSpencer87 👋
We should probably not have the different parts of summarize offload to separate module binaries. Can we refactor the collect-profile, export-metrics, and resource-counts modules into internal packages instead? Additionally, it would be good to have flags to optionally disable one or more of these when running dartboard summarize.
We can have separate module binaries for each of them, but the code providing the actual functionality should be contained within internal packages so that summarize and any other modules that need it can call the functions directly without relying on a separate binary.
|
We don't need to include all of these in this PR (we can iterate later), but I have some more notes after looking at this again:
|
Implements the
dartboard summarizeCLI subcommand used to gather:PR for #116
Please review carefully. Copilot and Gemini were used to assist with bash script conversions